Hillah
Text Generation Models for Luxembourgish with Limited Data: A Balanced Multilingual Strategy
Plum, Alistair, Ranasinghe, Tharindu, Purschke, Christoph
This paper addresses the challenges in developing language models for less-represented languages, with a focus on Luxembourgish. Despite its active development, Luxembourgish faces a digital data scarcity, exacerbated by Luxembourg's multilingual context. We propose a novel text generation model based on the T5 architecture, combining limited Luxembourgish data with equal amounts, in terms of size and type, of German and French data. We hypothesise that a model trained on Luxembourgish, German, and French will improve the model's cross-lingual transfer learning capabilities and outperform monolingual and large multilingual models. To verify this, the study at hand explores whether multilingual or monolingual training is more beneficial for Luxembourgish language generation. For the evaluation, we introduce LuxGen, a text generation benchmark that is the first of its kind for Luxembourgish.
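The balancing step described in the abstract above (equal amounts of Luxembourgish, German, and French data, matched in size and type) could be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `balance_by_type`, the nested-dictionary layout, and the text-type labels are all assumptions made for the example.

```python
import random


def balance_by_type(corpora, seed=0):
    """Downsample so every language has the same number of documents per text type.

    `corpora` maps language -> text type -> list of documents (hypothetical layout).
    Only text types present in every language are kept.
    """
    rng = random.Random(seed)
    # Text types shared by all languages, so the "type" balance is well defined.
    shared_types = set.intersection(*(set(by_type) for by_type in corpora.values()))
    balanced = {lang: {} for lang in corpora}
    for text_type in sorted(shared_types):
        # The smallest corpus (typically the low-resource language) sets the size.
        target = min(len(corpora[lang][text_type]) for lang in corpora)
        for lang in corpora:
            balanced[lang][text_type] = rng.sample(corpora[lang][text_type], target)
    return balanced


# Toy data: Luxembourgish ("lb") is the smallest corpus for each text type.
toy = {
    "lb": {"news": ["lb1", "lb2"], "wiki": ["lb3"]},
    "de": {"news": ["de1", "de2", "de3", "de4"], "wiki": ["de5", "de6"]},
    "fr": {"news": ["fr1", "fr2", "fr3"], "wiki": ["fr4", "fr5", "fr6"]},
}
balanced = balance_by_type(toy)
```

With the toy data, each language ends up with two news documents and one wiki document, so the high-resource languages cannot outweigh the low-resource one during training.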
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- Europe > Germany > Saxony > Leipzig (0.05)
- (16 more...)
Neural Text Normalization for Luxembourgish using Real-Life Variation Data
Lutgen, Anne-Marie, Plum, Alistair, Purschke, Christoph, Plank, Barbara
Orthographic variation is very common in Luxembourgish texts due to the absence of a fully-fledged standard variety. Additionally, developing NLP tools for Luxembourgish is a difficult task given the lack of annotated and parallel data, which is exacerbated by ongoing standardization. In this paper, we propose the first sequence-to-sequence normalization models using the ByT5 and mT5 architectures with training data obtained from word-level real-life variation data. We perform a fine-grained, linguistically-motivated evaluation to test byte-based, word-based and pipeline-based models for their strengths and weaknesses in text normalization. We show that our sequence model using real-life variation data is an effective approach for tailor-made normalization in Luxembourgish.
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
- Asia > Middle East > Iraq > Babil Governorate > Hillah (0.04)
- (12 more...)
LuxBank: The First Universal Dependency Treebank for Luxembourgish
Plum, Alistair, Döhmer, Caroline, Milano, Emilia, Lutgen, Anne-Marie, Purschke, Christoph
The Universal Dependencies (UD) project has significantly expanded linguistic coverage across 161 languages, yet Luxembourgish, a West Germanic language spoken by approximately 400,000 people, has remained absent until now. In this paper, we introduce LuxBank, the first UD Treebank for Luxembourgish, addressing the gap in syntactic annotation and analysis for this 'low-research' language. We establish formal guidelines for Luxembourgish language annotation, providing the foundation for the first large-scale quantitative analysis of its syntax. LuxBank serves not only as a resource for linguists and language learners but also as a tool for developing spell checkers and grammar checkers, organising existing text archives and even training large language models. By incorporating Luxembourgish into the UD framework, we aim to enhance the understanding of syntactic variation within West Germanic languages and offer a model for documenting smaller, semi-standardised languages. This work positions Luxembourgish as a valuable resource in the broader linguistic and NLP communities, contributing to the study of languages with limited research and resources.
- Africa > Middle East > Egypt > Cairo Governorate > Cairo (0.05)
- Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
- Asia > Middle East > Iraq > Babil Governorate > Hillah (0.04)
- (8 more...)
Comparison of Epilepsy Induced by Ischemic Hypoxic Brain Injury and Hypoglycemic Brain Injury using Multilevel Fusion of Data Features
Kadem, Sameer, Sami, Noor, Elaraby, Ahmed, Alyousif, Shahad, Jalil, Mohammed, Altaee, M., Almusawi, Muntather, Ismaeel, A. Ghany, Kareem, Ali Kamil, Kamalrudin, Massila, ftaiet, Adnan Allwi
The study aims to investigate the similarities and differences in the brain damage caused by Hypoxia-Ischemia (HI), Hypoglycemia, and Epilepsy. Hypoglycemia poses a significant challenge in improving glycemic regulation for insulin-treated patients, while HI brain disease in neonates is associated with low oxygen levels. The study examines the possibility of using a combination of medical data and Electroencephalography (EEG) measurements to predict outcomes over a two-year period. The study employs a multilevel fusion of data features to enhance the accuracy of the predictions. Therefore, this paper suggests a hybridized classification model for Hypoxia-Ischemia, Hypoglycemia, and Epilepsy brain injury (HCM-BI). A Support Vector Machine (SVM) is applied to clinical details to define the Hypoxia-Ischemia outcomes of each infant. The newborns are reassessed every two years to determine their neurodevelopmental outcomes. Four attributes are derived from the EEG records, and the SVM does not reach a conclusion regarding the classification of the diseases. The final feature extraction of the EEG signal is optimized by a Bayesian Neural Network (BNN) to obtain a clear picture of the health condition of Hypoglycemia and Epilepsy patients. By monitoring and assessing the physical effects observed in the EEG, the BNN is used to extract the test samples with the most log data and to report hypoglycemia and epilepsy.
Keywords: Hypoxia-Ischemia, Hypoglycemia, Epilepsy, Multilevel Fusion of Data Features, Bayesian Neural Network (BNN), Support Vector Machine (SVM)
- Asia > Middle East > Iraq > Baghdad Governorate > Baghdad (0.04)
- Asia > Middle East > Saudi Arabia > Al-Qassim Province > Buraydah (0.04)
- Europe > United Kingdom (0.04)
- (6 more...)
- Research Report > New Finding (0.93)
- Research Report > Experimental Study (0.87)
The $\mu\mathcal{G}$ Language for Programming Graph Neural Networks
Belenchia, Matteo, Corradini, Flavio, Quadrini, Michela, Loreti, Michele
Graph neural networks form a class of deep learning architectures specifically designed to work with graph-structured data. As such, they share the inherent limitations and problems of deep learning, especially regarding the issues of explainability and trustworthiness. We propose $\mu\mathcal{G}$, an original domain-specific language for the specification of graph neural networks that aims to overcome these issues. The language's syntax is introduced, and its meaning is rigorously defined by a denotational semantics. An equivalent characterization in the form of an operational semantics is also provided and, together with a type system, is used to prove the type soundness of $\mu\mathcal{G}$. We show how $\mu\mathcal{G}$ programs can be represented in a more user-friendly graphical visualization, and provide examples of its generality by showing how it can be used to define some of the most popular graph neural network models, or to develop any custom graph processing application.
- North America > United States > New York > New York County > New York City (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Europe > Italy (0.04)
- (9 more...)
- Research Report (0.64)
- Overview (0.46)
ProMoAI: Process Modeling with Generative AI
Kourani, Humam, Berti, Alessandro, Schuster, Daniel, van der Aalst, Wil M. P.
ProMoAI is a novel tool that leverages Large Language Models (LLMs) to automatically generate process models from textual descriptions, incorporating advanced prompt engineering, error handling, and code generation techniques. Beyond automating the generation of complex process models, ProMoAI also supports process model optimization. Users can interact with the tool by providing feedback on the generated model, which is then used for refining the process model. ProMoAI utilizes the capabilities of LLMs to offer a novel, AI-driven approach to process modeling, significantly reducing the barrier to entry for users without deep technical knowledge in process modeling.
- Europe > Austria > Vienna (0.14)
- Asia > Middle East > Iraq > Babil Governorate > Hillah (0.05)
- Africa > Rwanda > Kigali > Kigali (0.05)
- (9 more...)
LTL under reductions with weaker conditions than stutter-invariance
Paviot-Adet, Emmanuel, Poitrenaud, Denis, Renault, Etienne, Thierry-Mieg, Yann
Verification of properties expressed as ω-regular languages such as LTL can benefit hugely from stutter-insensitivity, using a diverse set of reduction strategies. However, properties that are not stutter-insensitive, for instance due to the use of the neXt operator of LTL or to some form of counting in the logic, are not covered by these techniques in general. We propose in this paper to study a weaker property than stutter-insensitivity. In a stutter-insensitive language, neither adding stutter to a word nor removing it changes its acceptance, so any stuttering can be abstracted away; by decomposing this equivalence relation into two implications we obtain weaker conditions. We define a shortening-insensitive language as one where any word that stutters less than a word in the language must also belong to the language. A lengthening-insensitive language has the dual property. A semi-decision procedure is then introduced to reliably prove shortening-insensitive properties or deny lengthening-insensitive properties while working with a reduction of a system. A reduction has the property that it can only shorten runs. Lipton's transaction reductions or Petri net agglomerations are examples of eligible structural reduction strategies. An implementation and experimental evidence are provided, showing that most nonrandom properties sensitive to stutter are actually shortening or lengthening insensitive. The performance of experiments on a large (random) benchmark from the model-checking competition indicates that, despite being a semi-decision procedure, the approach can still improve state-of-the-art verification tools.
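The shortening-insensitivity condition defined in the abstract above can be illustrated on a toy scale. The sketch below is a simplification and not the paper's procedure: it assumes finite words written as Python strings and finite languages given as sets, whereas the paper works with infinite words, and the helper names `shortenings` and `is_shortening_insensitive` are hypothetical.

```python
from itertools import product


def shortenings(word):
    """Return every word obtained from `word` by removing some stutter.

    Each run of identical consecutive letters may shrink to any length >= 1,
    so the result includes `word` itself and its stutter-free form.
    """
    runs = []  # list of [letter, run_length]
    for letter in word:
        if runs and runs[-1][0] == letter:
            runs[-1][1] += 1
        else:
            runs.append([letter, 1])
    choices = (range(1, length + 1) for _, length in runs)
    return {
        "".join(letter * k for (letter, _), k in zip(runs, lengths))
        for lengths in product(*choices)
    }


def is_shortening_insensitive(language):
    """True if every less-stuttering variant of each word is also in the language."""
    return all(shortenings(word) <= language for word in language)


closed = {"ab", "aab", "aaab"}   # contains every shortening of its words
not_closed = {"aab"}             # "ab" stutters less than "aab" but is missing
```

Here `is_shortening_insensitive(closed)` holds while `is_shortening_insensitive(not_closed)` does not. Lengthening insensitivity is the dual condition (closure under adding stutter), which no non-empty finite language can satisfy, so it is not checked in this finite sketch.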
- Europe > France > Île-de-France > Paris > Paris (0.04)
- North America > United States > Pennsylvania (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- (3 more...)